Abstract

A recent surge of interest in cognitive assessment has led to the development of novel statistical
models for diagnostic classification. Central to many such models is the well-known Q-matrix,
which specifies the item-attribute relationship. This paper proposes a principled estimation
procedure for the Q-matrix and related model parameters. Desirable theoretical properties are
established through large-sample analysis. The proposed method also provides a platform on which
important statistical issues, such as hypothesis testing and model selection, can be addressed.

1 Introduction

Diagnostic classification models (DCMs) are important statistical tools in cognitive diagnosis and
have widespread applications in educational measurement, psychiatric evaluation, human resource
development, and many other areas in science, medicine, and business. A key component in many
such models is the so-called Q-matrix, first introduced by Tatsuoka (1983); see also Tatsuoka (2009)
for a detailed coverage. The Q-matrix specifies the item-attribute relationship, so that responses to
items can reveal the attribute configuration of the respondent. In fact, Tatsuoka (1983, 2009) proposed
the rule space method, which is simple and easy to use.

Flexible and sophisticated statistical models can be built around the Q-matrix. Two such models
are the DINA model (Deterministic Input, Noisy Output "AND" gate; see Junker and Sijtsma, 2001)
and the DINO model (Deterministic Input, Noisy Output "OR" gate; see Templin, 2006; Templin and
Henson, 2006). Other important developments can be found in DiBello, Stout, and Roussos (1995);
Junker and Sijtsma (2001); Hartz (2002); [...] (1985); Leighton, Gierl, and Hunka (2004);
Templin (2006); and Chiu, Douglas, and Li (2009). Rupp, Templin, and Henson (2010) contains a
comprehensive summary of many classical and recent developments.

There is a growing literature on the statistical inference of Q-matrix based DCMs that addresses
the estimation of item parameters when the Q-matrix is prespecified (Rupp, 2002;
Henson and Templin, 2004; Roussos, Templin, and Henson, 2007; Stout, 2007). Having a correctly
specified Q-matrix is crucial both for parameter estimation (such as the slipping and guessing
probabilities and the attribute distribution) and for the identification of subjects' underlying
attributes. As a result, these approaches are sensitive to the choice of the Q-matrix (Rupp and
Templin, 2008; de la Torre, 2008; de la Torre and Douglas, 2004). For instance, a misspecified
Q-matrix may lead to substantial lack of fit and, consequently, erroneous attribute identification.
Thus, it is desirable to be able to detect misspecification and to obtain a data-driven Q-matrix.

In contrast, there has not been much work on the estimation of the Q-matrix. To our knowledge,
the only rigorous treatment of the subject is given by Liu, Xu, and Ying (2011), which defines an
estimator of the Q-matrix under the DINA model assumption and provides regularity conditions
under which desirable theoretical properties are established. The work of this paper may be viewed
as a continuation of Liu et al. (2011) in the sense that it completes the estimation of the Q-matrix
for the DINA model and extends the estimation procedure (as well as the consistency results) to
the DINO model. The DINA and the DINO models impose rather different interactions among
attributes. However, we show that there exists a duality between the two models. This feature is
especially interesting for theoretical development, as it allows us to adapt the results and
analysis techniques developed for the DINA model to the DINO model without much additional
effort. This will be shown in our technical developments.

The main contribution of this paper is two-fold. First, it provides a rigorous analysis of the
Q-matrix for the DINA model when both the slipping and guessing parameters are unknown. This
is a substantial extension of the results in Liu et al. (2011), which requires complete knowledge
of the guessing parameter. It gives a definitive answer to the estimability of the Q-matrix for the
DINA model by presenting a set of sufficient conditions under which a consistent estimator exists.
Second, we conduct a parallel analysis (to the analysis for the DINA model) for the DINO model.
In particular, a consistent estimator of the Q-matrix for the DINO model and its properties are
presented. Thanks to the duality structure, part of the intermediate results developed for the DINA
model can be reused in the analysis of the DINO model.

One may notice that our estimation procedure is in fact generic in the sense that it is
applicable to a large class of DCMs besides the DINA and DINO models. In particular, the
procedure is applicable to the NIDA (Noisy Inputs, Deterministic "And" Gate) model and the
NIDO (Noisy Inputs, Deterministic "Or" Gate) model among others, though theoretical properties
under such model specifications still need to be established. In addition to the estimation of the
Q-matrix, we emphasize that the idea behind the derivations forms a principled inference
framework. For instance, in the course of describing the estimation procedure, necessary
conditions for a correctly specified Q-matrix are naturally derived. Such conditions can be used
to form appropriate statistics for hypothesis testing and model diagnostics. In that connection,
additional developments (e.g. the asymptotic distributions of those statistics) are needed, but they
are not the focus of the current paper. Therefore, the proposed framework can potentially serve as
a principled inference tool for the Q-matrix in diagnostic classification models.

This paper is organized as follows. Section 2 contains the main ingredients: the presentation of the
estimation procedures for both the DINA and DINO models and the statement of the consistency
results. Section 3 includes further discussion of the theorems and various issues. The proofs of the
main theorems in Section 2 and several important propositions are given in Section 4. The most
technical proofs of two central propositions are given in the Appendix.



2 Main results 

2.1 Notation and model specification 

The specification of the diagnostic classification models considered in this paper consists of the 
following concepts. 

Attribute: subject's underlying mastery of certain skills or presence of certain mental health
conditions. There are k attributes and we use $A = (A^1, \ldots, A^k)^\top$ to denote the vector of attributes,
where $A^j = 1$ or $0$, indicating presence or absence of the j-th attribute, $j = 1, \ldots, k$.

Responses to items: There are m items and we use $R = (R^1, \ldots, R^m)^\top$ to denote the vector
of responses to them. For simplicity, we assume that $R^j \in \{0, 1\}$ is a binary variable for each
$j = 1, \ldots, m$.

Note that both A and R are subject specific. Throughout this paper, we assume that the 
number of attributes k is known and that the number of items m is always observed. 

Q-matrix: the link between the items and the attributes. In particular, $Q = (Q_{ij})_{m \times k}$ is an
$m \times k$ matrix with binary entries. For each i and j, $Q_{ij} = 1$ indicates that item i requires attribute
j and $Q_{ij} = 0$ otherwise.

We define a capability indicator, $\xi^i(A, Q)$, which indicates whether a subject possessing attribute profile
A is capable of providing a positive response to item i when the item-attribute relationship is specified
by the matrix Q. Different capability indicators give rise to different DCMs. For instance,

$$\xi^i_{DINA}(A, Q) = \mathbf{1}(A^j \ge Q_{ij} \text{ for all } j = 1, \ldots, k) \tag{1}$$

is associated with the DINA model, where 1 is the usual indicator function. The DINA model 
assumes conjunctive relationship among attributes, that is, it is necessary to possess all the at- 
tributes indicated by the Q-matrix to be capable of providing a positive response to an item. In 
addition, having additional unnecessary attributes does not compensate for the lack of the necessary 
attributes. The DINA model is particularly popular in the context of educational testing. 

As an alternative to the "and" relationship, one may impose an "or" relationship among the
attributes, resulting in the DINO model. The corresponding capability indicator takes the following
form

$$\xi^i_{DINO}(A, Q) = \mathbf{1}(\text{there exists a } j \text{ such that } A^j \ge Q_{ij} = 1). \tag{2}$$

That is, one needs to possess at least one of the required attributes to be capable of responding 
positively to that item. 

The last ingredient of the model specification is related to the so-called slipping and guessing
parameters. The names "slipping" and "guessing" arise from the educational applications. The
slipping parameter is the probability that a subject (with attribute profile A) responds negatively
to an item when the capability indicator for that item is $\xi^i_{DINA}(A, Q) = 1$; similarly, the guessing
parameter is the probability that a subject responds positively when his/her capability indicator is
$\xi^i_{DINA}(A, Q) = 0$. We use s to denote the slipping probability and g to denote the guessing
probability (with corresponding subscripts indicating different items). In the technical development, it
is more convenient to work with the complement of the slipping parameter. Therefore, we define
$c = 1 - s$ to be the correct-answering probability, with $s_i$ and $c_i$ being the corresponding
item-specific notation. Given a specific subject's profile A, the response to item i under the DINA model
follows a Bernoulli distribution

$$P(R^i = 1 \mid A) = c_i^{\xi^i_{DINA}(A, Q)} \, g_i^{1 - \xi^i_{DINA}(A, Q)}. \tag{3}$$

With the same definition of $c_i$ and $g_i$, the response under the DINO model follows

$$P(R^i = 1 \mid A) = c_i^{\xi^i_{DINO}(A, Q)} \, g_i^{1 - \xi^i_{DINO}(A, Q)}. \tag{4}$$

In addition, conditional on A, $(R^1, \ldots, R^m)$ are jointly independent.
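
As a small illustration of the model specification, the two capability indicators and the resulting response probabilities can be sketched in Python. This is a minimal sketch rather than code from the paper: the function names `xi_dina`, `xi_dino`, and `p_positive` and the numerical values are hypothetical, and the DINO indicator is coded as "possesses at least one required attribute", following the verbal description in the text.

```python
from typing import Sequence


def xi_dina(profile: Sequence[int], q_row: Sequence[int]) -> int:
    """DINA capability: 1 iff the profile has every attribute the item requires."""
    return int(all(a >= q for a, q in zip(profile, q_row)))


def xi_dino(profile: Sequence[int], q_row: Sequence[int]) -> int:
    """DINO capability: 1 iff the profile has at least one required attribute."""
    return int(any(a == 1 and q == 1 for a, q in zip(profile, q_row)))


def p_positive(xi: int, c: float, g: float) -> float:
    """P(R^i = 1 | A) = c^xi * g^(1 - xi): c if capable, g (guessing) otherwise."""
    return c if xi == 1 else g


# Hypothetical item requiring both of k = 2 attributes; the subject has only the first.
profile, q_row = (1, 0), (1, 1)
prob_dina = p_positive(xi_dina(profile, q_row), c=0.9, g=0.2)  # not capable under DINA
prob_dino = p_positive(xi_dino(profile, q_row), c=0.9, g=0.2)  # capable under DINO
```

The same subject and item thus receive different response probabilities under the two models, which is the conjunctive versus disjunctive distinction described above.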

Lastly, we use subscripts to indicate different subjects. For instance, $R_r = (R_r^1, \ldots, R_r^m)^\top$ is the
response vector of subject r. Similarly, $A_r$ is the attribute vector of subject r. With N subjects,
we observe $R_1, \ldots, R_N$ but not $A_1, \ldots, A_N$. This completes our model specification.

2.2 Estimation of the Q-matrix

In this section, we develop a general approach to the estimation of the Q-matrix and item param- 
eters. We first deal with the DINA model and then, via introducing a duality relation, the DINO 
model. 

2.2.1 DINA model 

We need to introduce additional notation and concepts. Throughout the discussion, we use Q to 
denote the true matrix and Q' to denote a generic m x k binary matrix. 

Attribute distribution. We assume that the subjects are a random sample (of size N) from a
designated population so that their attribute profiles, $A_r$, $r = 1, \ldots, N$, are i.i.d. random variables
with the following distribution

$$P(A_r = A) = p_A, \tag{5}$$

where, for each $A \in \{0, 1\}^k$, $p_A \in [0, 1]$ and $\sum_A p_A = 1$. We use $p = (p_A : A \in \{0, 1\}^k)$ to
denote the distribution of the attribute profiles.

The T-matrix. The T-matrix is a non-linear function of the Q-matrix and provides a linear
relationship between the attribute distribution and the response distribution. In particular, let
T(Q) be a matrix of $2^k$ columns. Each column of T(Q) corresponds to one attribute profile $A \in \{0, 1\}^k$.
To facilitate the description, we use binary vectors of length k to label the columns of T(Q)
instead of using ordinal numbers. For instance, the A-th column of T(Q) is the column that
corresponds to attribute profile A.

Let $I_i$ be a generic notation for a positive response to item i. Let "$\wedge$" stand for the "and"
combination. For instance, $I_{i_1} \wedge I_{i_2}$ denotes positive responses to both items $i_1$ and $i_2$. Each row of
T(Q) corresponds to one item or one "and" combination of items, for instance, $I_{i_1}$, $I_{i_1} \wedge I_{i_2}$, or
$I_{i_1} \wedge I_{i_2} \wedge I_{i_3}$. If T(Q) contains all the single items and all "and" combinations, it has $2^m - 1$
rows. We will later say that such a T(Q) is saturated.

We now proceed to the description of each row vector of T(Q). We define $B_Q(I_i)$ to be a
$2^k$-dimensional row vector. Using the same labeling system as that of the columns of T(Q), the
A-th element of $B_Q(I_i)$ is defined as $\xi^i_{DINA}(A, Q)$, that is, this element indicates whether a subject with
attribute profile A is capable of responding positively to item i. Thus, $B_Q(I_i)$ is the vector indicating the
attribute profiles that are capable of responding positively to item i.

Using a similar notation, we define

$$B_Q(I_{i_1} \wedge \cdots \wedge I_{i_l}) = \Upsilon_{h=1}^{l} B_Q(I_{i_h}), \tag{6}$$

where the operator "$\Upsilon_{h=1}^{l}$" is element-by-element multiplication from $B_Q(I_{i_1})$ to $B_Q(I_{i_l})$. For
instance,

$$W = \Upsilon_{h=1}^{l} V_h$$

means that $W^j = \prod_{h=1}^{l} V_h^j$, where $W = (W^1, \ldots, W^{2^k})$ and $V_h = (V_h^1, \ldots, V_h^{2^k})$. Therefore,
$B_Q(I_{i_1} \wedge \cdots \wedge I_{i_l})$ is the vector indicating the attribute profiles that are capable of responding positively
to items $i_1, \ldots, i_l$. The row in T(Q) corresponding to $I_{i_1} \wedge \cdots \wedge I_{i_l}$ is $B_Q(I_{i_1} \wedge \cdots \wedge I_{i_l})$.

α-vector. We let $\alpha$ be a column vector whose length is equal to the number of rows in T(Q).
Each component of $\alpha$ corresponds to a row vector of T(Q). The element of $\alpha$ corresponding to
$I_{i_1} \wedge \cdots \wedge I_{i_l}$ is $N_{I_{i_1} \wedge \cdots \wedge I_{i_l}} / N$, where $N_{I_{i_1} \wedge \cdots \wedge I_{i_l}}$ denotes the number of people with positive responses
to items $i_1, \ldots, i_l$, that is,

$$N_{I_{i_1} \wedge \cdots \wedge I_{i_l}} = \sum_{r=1}^{N} \prod_{j=1}^{l} R_r^{i_j}.$$


No slipping or guessing. We first consider a simplified situation in which both the slipping
and guessing probabilities are zero. In this special situation, (3) implies that

$$R_r^i = \xi^i_{DINA}(A_r, Q), \quad i = 1, \ldots, m; \; r = 1, \ldots, N.$$

In other words, the probabilistic relationship becomes a deterministic one. We further let
$\hat p = \{\hat p_A : A \in \{0, 1\}^k\}$ be the (unobserved) empirical distribution of the attribute profiles, that
is,

$$\hat p_A = \frac{1}{N} \sum_{r=1}^{N} \mathbf{1}(A_r = A).$$

Note that each row vector of T(Q) indicates the attribute profiles that are capable of responding
positively to the corresponding item(s). Then, for each set of $i_1, \ldots, i_l$, we may expect the following
identity

$$B_Q(I_{i_1} \wedge \cdots \wedge I_{i_l}) \hat p = \frac{N_{I_{i_1} \wedge \cdots \wedge I_{i_l}}}{N},$$

where $B_Q$ is a row vector and $\hat p$ is a column vector. Therefore, thanks to the construction of T(Q)
and the vector $\alpha$, in the absence of slipping and guessing, we may expect the following set of
linear equations to hold:

$$T(Q) \hat p = \alpha.$$

Note that $\hat p$ is not observed. The above display implies that if the Q-matrix is correctly specified
and the slipping and guessing probabilities are zero, then the linear equation $T(Q) p = \alpha$ (with p
being the variable) has at least one solution. For each binary matrix Q', we define

$$S(Q') = \inf_{p} |T(Q') p - \alpha|,$$

where the minimization is subject to the constraints that $p_A \in [0, 1]$ and $\sum_A p_A = 1$. Based
on the above results, we may expect that $S(Q) = 0$ and therefore Q is one of the minimizers of
$S(\cdot)$. In addition, the empirical distribution $\hat p$ is one of the minimizers of $|T(Q) p - \alpha|$. Therefore,
we have derived a set of necessary conditions for a correctly specified Q-matrix. In our subsequent
theoretical developments, we will show that under some circumstances these conditions are also
sufficient.



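Illustrative example. To aid the understanding of the T-matrix, we provide one simple example.
Consider the following 3 x 2 Q-matrix,

$$Q = \begin{array}{c|cc} & \text{addition} & \text{multiplication} \\ \hline 2 + 3 & 1 & 0 \\ 5 \times 2 & 0 & 1 \\ (2 + 3) \times 2 & 1 & 1 \end{array} \tag{7}$$

and the contingency table of the (empirical) attribute distribution

$$\begin{array}{c|cc} \text{addition} \,\backslash\, \text{multiplication} & 0 & 1 \\ \hline 0 & \hat p_{00} & \hat p_{01} \\ 1 & \hat p_{10} & \hat p_{11} \end{array}$$

Note that if the Q-matrix is correctly specified and the slipping and guessing probabilities are all
zero, we should be able to obtain the following identities

$$N(\hat p_{10} + \hat p_{11}) = N_{I_1}, \quad N(\hat p_{01} + \hat p_{11}) = N_{I_2}, \quad N \hat p_{11} = N_{I_3}. \tag{8}$$

We then create the corresponding T-matrix and α-vector as follows

$$T(Q) = \begin{pmatrix} 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \alpha = \begin{pmatrix} N_{I_1}/N \\ N_{I_2}/N \\ N_{I_3}/N \end{pmatrix}. \tag{9}$$

The first column of T(Q) corresponds to the zero attribute profile; the second corresponds to
A = (1, 0); the third corresponds to A = (0, 1); and the last corresponds to A = (1, 1). The first
row of T(Q) corresponds to item 2 + 3, the second to 5 x 2, the third to (2 + 3) x 2. In addition,
we may further consider combinations such as

$$N \hat p_{11} = N_{I_1 \wedge I_2}.$$

The corresponding T-matrix and α-vector should be

$$T(Q) = \begin{pmatrix} 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \alpha = \begin{pmatrix} N_{I_1}/N \\ N_{I_2}/N \\ N_{I_3}/N \\ N_{I_1 \wedge I_2}/N \end{pmatrix}. \tag{10}$$

Under the DINA model assumption and $g_i = s_i = 1 - c_i = 0$, we obtain that

$$T(Q) \hat p = \alpha.$$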
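
The example can be checked numerically. The following is a minimal Python sketch, not code from the paper: the helper names `dina_row` and `saturated_T` and the four-subject sample are hypothetical. It builds the saturated T-matrix for the 3 x 2 Q-matrix of the example, generates noise-free responses, and compares T(Q) applied to the empirical attribute distribution with the α-vector.

```python
from itertools import combinations

Q = [[1, 0],   # item "2 + 3": addition only
     [0, 1],   # item "5 x 2": multiplication only
     [1, 1]]   # item "(2 + 3) x 2": both attributes
# Column order used in the text: (0,0), (1,0), (0,1), (1,1).
PROFILES = [(0, 0), (1, 0), (0, 1), (1, 1)]


def dina_row(q_row):
    """B_Q(I_i): which profiles can answer the item positively under DINA."""
    return [int(all(a >= q for a, q in zip(A, q_row))) for A in PROFILES]


def saturated_T(Q):
    """Rows for every non-empty 'and' combination of items (2^m - 1 rows)."""
    single = [dina_row(q) for q in Q]
    labels, rows = [], []
    for size in range(1, len(Q) + 1):
        for items in combinations(range(len(Q)), size):
            row = [1] * len(PROFILES)
            for i in items:  # element-by-element product of single-item rows
                row = [x * y for x, y in zip(row, single[i])]
            labels.append(items)
            rows.append(row)
    return labels, rows


# Hypothetical sample of N = 4 subjects with no slipping or guessing.
subjects = [(0, 0), (1, 0), (1, 1), (1, 1)]
N = len(subjects)
responses = [[dina_row(q)[PROFILES.index(A)] for q in Q] for A in subjects]

labels, T = saturated_T(Q)
p_hat = [subjects.count(A) / N for A in PROFILES]  # empirical attribute distribution
alpha = [sum(all(r[i] == 1 for i in items) for r in responses) / N
         for items in labels]
T_p = [sum(t * p for t, p in zip(row, p_hat)) for row in T]  # T(Q) p_hat
```

Here `T_p` and `alpha` agree entry by entry, as the identity above requires in the absence of slipping and guessing; the first three rows of `T` are the single-item rows of the example.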



Nonzero slipping and guessing probabilities. We next extend the necessary conditions just
derived to nonzero but known slipping and guessing probabilities. To do so, we need to modify the
T-matrix. Let $T_{c,g}(Q)$ be a matrix with the same dimension as that of T(Q), with each row vector
defined slightly differently to incorporate the slipping and guessing probabilities. In particular,
let

$$B_{c,g,Q}(I_i) = (c_i - g_i) B_Q(I_i) + g_i E,$$

where $E = (1, \ldots, 1)$ is the row vector of ones and $c_i$ is the positive responding probability of item
i. In addition, we let

$$B_{c,g,Q}(I_{i_1} \wedge \cdots \wedge I_{i_l}) = \Upsilon_{h=1}^{l} B_{c,g,Q}(I_{i_h}). \tag{11}$$

Clearly, each element of $B_{c,g,Q}(I_i)$ is the probability of observing a positive response to item i for
a certain attribute profile. Likewise, the elements of $B_{c,g,Q}(I_{i_1} \wedge \cdots \wedge I_{i_l})$ are the probabilities of
positive responses to items $i_1, \ldots, i_l$. The row in $T_{c,g}(Q)$ corresponding to $I_{i_1} \wedge \cdots \wedge I_{i_l}$ is
$B_{c,g,Q}(I_{i_1} \wedge \cdots \wedge I_{i_l})$. To facilitate our statement, we define

$$T_c(Q) = T_{c,\mathbf{0}}(Q), \tag{12}$$

where $\mathbf{0} = (0, \ldots, 0)^\top$ is the zero vector. That is, $T_c(Q)$ is the matrix $T_{c,g}(Q)$ with guessing
probabilities being zero.

Recall that p is the attribute distribution. Thus,

$$P(R^{i_1} = 1, \ldots, R^{i_l} = 1) = E\{P(R^{i_1} = 1, \ldots, R^{i_l} = 1 \mid A)\} = B_{c,g,Q}(I_{i_1} \wedge \cdots \wedge I_{i_l}) p.$$

Further, we obtain that

$$E(\alpha) = T_{c,g}(Q) p.$$

In the presence of slipping and guessing, one cannot expect to solve the equation $T_{c,g}(Q) p = \alpha$ exactly
as in the case of no guessing and slipping. On the other hand, thanks to the law of large
numbers, $\alpha \to E(\alpha)$ as $N \to \infty$, so the equation can be solved asymptotically.
Thus, for a generic Q', we define the loss function

$$S_{c,g}(Q') = \inf_{p} |T_{c,g}(Q') p - \alpha|, \tag{13}$$

where the optimization is subject to the constraints that $p_A \in [0, 1]$ and $\sum_A p_A = 1$, and $|\cdot|$
is the Euclidean norm. In view of the preceding argument, we expect that

$$S_{c,g}(Q) \to 0 \tag{14}$$

almost surely as $N \to \infty$, that is, the true Q-matrix asymptotically minimizes the criterion function
$S_{c,g}$. This leads us to propose the following estimator of Q:

$$\hat Q(c, g) = \arg\inf_{Q'} S_{c,g}(Q'), \tag{15}$$

where (c, g) is included in $\hat Q$ to indicate that the resulting estimator requires knowledge of the
correct responding and guessing probabilities.
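
To make the criterion function concrete, the following Python sketch evaluates it for a small hypothetical example. It is an illustration under stated assumptions, not the paper's implementation: the names `T_cg`, `project_simplex`, and `loss_S` are hypothetical, the constrained minimization over p is approximated by a plain projected-gradient iteration (a dedicated quadratic programming routine would be the natural choice in practice), and α is replaced by its expectation so that the example is deterministic.

```python
from itertools import combinations, product
from math import sqrt


def T_cg(Q, c, g):
    """Saturated T_{c,g}(Q): products of P(R^i = 1 | A) over item subsets."""
    m, k = len(Q), len(Q[0])
    profiles = list(product((0, 1), repeat=k))
    single = [[c[i] if all(a >= q for a, q in zip(A, Q[i])) else g[i]
               for A in profiles] for i in range(m)]
    rows = []
    for size in range(1, m + 1):
        for items in combinations(range(m), size):
            row = [1.0] * len(profiles)
            for i in items:
                row = [x * y for x, y in zip(row, single[i])]
            rows.append(row)
    return rows


def project_simplex(v):
    """Euclidean projection of v onto {p : p_A >= 0, sum_A p_A = 1}."""
    theta, running = 0.0, 0.0
    for j, u in enumerate(sorted(v, reverse=True), start=1):
        running += u
        if u - (running - 1.0) / j > 0:
            theta = (running - 1.0) / j
    return [max(x - theta, 0.0) for x in v]


def loss_S(Q, c, g, alpha, iters=5000):
    """Approximate S_{c,g}(Q) = inf_p |T_{c,g}(Q) p - alpha| by projected gradient."""
    T = T_cg(Q, c, g)
    n = len(T[0])
    step = 1.0 / sum(x * x for row in T for x in row)  # 1 / squared Frobenius norm
    p = [1.0 / n] * n

    def residual(p):
        return [sum(t * x for t, x in zip(row, p)) - a for row, a in zip(T, alpha)]

    for _ in range(iters):
        res = residual(p)
        grad = [sum(row[j] * r for row, r in zip(T, res)) for j in range(n)]
        p = project_simplex([x - step * d for x, d in zip(p, grad)])
    return sqrt(sum(r * r for r in residual(p))), p


# Hypothetical population; alpha is set to its expectation T_{c,g}(Q) p.
Q_true = [[1, 0], [0, 1], [1, 1]]
c, g = [0.9, 0.8, 0.85], [0.1, 0.2, 0.15]
p_true = [0.4, 0.3, 0.2, 0.1]
alpha = [sum(t * x for t, x in zip(row, p_true)) for row in T_cg(Q_true, c, g)]
S_true, p_fit = loss_S(Q_true, c, g, alpha)
S_wrong, _ = loss_S([[1, 0], [1, 0], [1, 1]], c, g, alpha)  # a misspecified Q'
```

With α equal to its expectation, `S_true` should be close to zero up to the accuracy of the iteration, while `S_wrong` shows the value of the criterion at one misspecified candidate. The estimator in (15) would repeat this evaluation over all candidate matrices Q'.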

Situations when c and g are unknown. Suppose that for a given Q', we can construct an
estimator $(\hat c(Q'), \hat g(Q'))$ of (c, g). In addition, suppose that $(\hat c(Q), \hat g(Q))$ is consistent, that is,
$(\hat c(Q), \hat g(Q)) \to (c, g)$ in probability as $N \to \infty$. Then, we define

$$\hat Q_{\hat c, \hat g} = \arg\inf_{Q'} S_{\hat c(Q'), \hat g(Q')}(Q'), \tag{16}$$

that is, we plug the estimator of (c, g) into the objective function in (15). We will present one
specific choice of $(\hat c, \hat g)$ in Section 2.2.3.

2.2.2 DINO model 

We now proceed to the description of the estimation procedure of the DINO model. The DINO can 
be considered as the dual model of the DINA model. The estimation procedure is similar except 
that the "AND" relationship needs to be changed to an "OR" relationship. In subsequent technical 
development, we will provide the precise meaning of the duality. First, we present the construction 
of the estimator. 

The U-matrix. The matrix $U_{c,g}(Q)$ is similar to $T_{c,g}(Q)$ except that it admits an "OR"
relationship among items. In particular, first define $F_Q(I_i)$ to be a vector of dimension $2^k$ whose
A-th element is $\xi^i_{DINO}(A, Q)$. Therefore, $F_Q(I_i)$ indicates the attribute profiles that are
capable of providing positive responses to item i. We use "$\vee$" to denote the "OR" combinations
among items and define

$$F_Q(I_{i_1} \vee \cdots \vee I_{i_l}) = E - \Upsilon_{h=1}^{l} (E - F_Q(I_{i_h})).$$

Thus, $F_Q(I_{i_1} \vee \cdots \vee I_{i_l})$ is a vector indicating the attribute profiles that are capable of responding
positively to at least one of the items $i_1, \ldots, i_l$. We let the row in U(Q) corresponding to $I_{i_1} \vee \cdots \vee I_{i_l}$
be $F_Q(I_{i_1} \vee \cdots \vee I_{i_l})$. In the presence of slipping and guessing, we define

$$F_{c,g,Q}(I_i) = (c_i - g_i) F_Q(I_i) + g_i E$$

and

$$F_{c,g,Q}(I_{i_1} \vee \cdots \vee I_{i_l}) = E - \Upsilon_{h=1}^{l} (E - F_{c,g,Q}(I_{i_h})).$$

We let the row in $U_{c,g}(Q)$ corresponding to "$I_{i_1} \vee \cdots \vee I_{i_l}$" be $F_{c,g,Q}(I_{i_1} \vee \cdots \vee I_{i_l})$.
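
The "OR" rows can be sketched in the same way as the "AND" rows, with the element-wise product applied to the complements. The following Python fragment is a minimal illustration with hypothetical names (`dino_row`, `or_combination`) and hypothetical parameter values; it is not code from the paper.

```python
from itertools import product

PROFILES = list(product((0, 1), repeat=2))  # (0,0), (0,1), (1,0), (1,1)


def dino_row(q_row, c=1.0, g=0.0):
    """F_{c,g,Q}(I_i): P(R^i = 1 | A) per profile; c = 1, g = 0 gives F_Q(I_i)."""
    return [c if any(a == 1 and q == 1 for a, q in zip(A, q_row)) else g
            for A in PROFILES]


def or_combination(rows):
    """E - (element-wise product of (E - row)): P(at least one positive response)."""
    none_positive = [1.0] * len(PROFILES)
    for row in rows:
        none_positive = [x * (1.0 - y) for x, y in zip(none_positive, row)]
    return [1.0 - x for x in none_positive]


# Hypothetical items: item 1 requires attribute 1, item 2 requires attribute 2.
f1, f2 = dino_row([1, 0]), dino_row([0, 1])
f_or = or_combination([f1, f2])  # profiles capable of at least one of the two items

# The same row with hypothetical slipping and guessing parameters.
f_or_noisy = or_combination([dino_row([1, 0], c=0.9, g=0.1),
                             dino_row([0, 1], c=0.8, g=0.2)])
```

Taking complements, multiplying, and taking the complement again is the same computation as for the "AND" rows applied to negative responses, which is one way to read the duality between the two constructions.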

The β-vector. The vector $\beta$ plays a similar role as the vector $\alpha$ for the DINA model. Specifically,
$\beta$ is a column vector whose length is equal to the number of rows of U(Q). Each element of $\beta$
corresponds to one row vector of U(Q). The element of $\beta$ corresponding to $I_{i_1} \vee \cdots \vee I_{i_l}$ is defined
as

$$N_{I_{i_1} \vee \cdots \vee I_{i_l}} / N = \frac{1}{N} \sum_{r=1}^{N} \mathbf{1}(\text{there exists a } j \text{ such that } R_r^{i_j} = 1).$$

With such a construction and a correctly specified Q, one may expect that

$$\beta \to U_{c,g}(Q) p$$

almost surely as $N \to \infty$. Therefore, we define the objective function

$$V_{c,g}(Q) = \inf_{p} |U_{c,g}(Q) p - \beta|, \tag{17}$$

where the infimum is subject to $\sum_A p_A = 1$ and $p_A \in [0, 1]$. Furthermore, an estimator of Q can be obtained
by

$$\hat Q(c, g) = \arg\inf_{Q'} V_{c,g}(Q'). \tag{18}$$

In cases when the parameters c or g are unknown, we may plug in their estimates and define

$$\hat Q_{\hat c, \hat g} = \arg\inf_{Q'} V_{\hat c(Q'), \hat g(Q')}(Q'). \tag{19}$$

2.2.3 Estimators for the slipping and guessing parameters 

To complete our estimation procedure, we provide one generic estimator for (c, g). For the DINA
model, we let

$$(\hat c(Q), \hat g(Q)) = \arg\inf_{c, g \in [0,1]^m} S_{c,g}(Q); \tag{20}$$

and for the DINO model, we let

$$(\hat c(Q), \hat g(Q)) = \arg\inf_{c, g \in [0,1]^m} V_{c,g}(Q). \tag{21}$$

We emphasize that $(\hat c, \hat g)$ may not be a consistent estimator of (c, g). To illustrate this, we present
one example discussed in Liu et al. (2011). Consider the case of m = k items with k attributes and a
complete matrix $Q = \mathcal{I}_k$, the k x k identity matrix. The degrees of freedom of a k-way binary table is
$2^k - 1$. On the other hand, the dimension of the parameters (p, c, g) is $2^k - 1 + 2k$. Therefore, p, c, and g
cannot be consistently identified without additional information. This problem is typically tackled
by introducing additional parametric assumptions, such as p satisfying a certain functional form, or,
in the Bayesian setting, (weakly) informative prior distributions (Gelman, Jakulin, Pittau, and Su,
2008). Given that the emphasis of this paper is the inference of the Q-matrix, we do not further
investigate the identifiability of (p, c, g). Despite the consistency issues, if one adopts the estimators
in (20) and (21) for the estimator of Q as in (16) and (19), the consistency results remain valid even
if $(\hat c(Q), \hat g(Q))$ is inconsistent. We will address this issue in more detail in the remarks after the
statements of the main theorems.
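
The counting argument in the identity-matrix example can be tabulated directly. The short Python sketch below uses hypothetical function names and only restates the two counts given in the text.

```python
def table_degrees_of_freedom(k: int) -> int:
    """Free cell probabilities of a k-way binary response table: 2^k - 1."""
    return 2 ** k - 1


def parameter_dimension(k: int) -> int:
    """Dimension of (p, c, g) when m = k: 2^k - 1 for p, plus k each for c and g."""
    return (2 ** k - 1) + 2 * k


# The parameter dimension exceeds the table's degrees of freedom by 2k for every k.
excess = {k: parameter_dimension(k) - table_degrees_of_freedom(k) for k in (2, 3, 5)}
```

For every k the parameters outnumber the degrees of freedom by exactly 2k, which is why (p, c, g) cannot all be identified from the response table alone in this example.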



2.3 Theoretical properties 
2.3.1 Notation 

To facilitate the statements, we first introduce notation and some necessary conditions that will be 
referred to in later discussions. 



• Linear space spanned by vectors $V_1, \ldots, V_l$:

  $$\mathcal{C}(V_1, \ldots, V_l) = \Big\{ \sum_{i=1}^{l} a_i V_i : a_i \in \mathbb{R} \Big\}.$$

• For a matrix M, $M_{1:l}$ denotes the submatrix containing the first l rows and all columns of
M.

• Vector $e_i$ denotes a column vector with the i-th element being 1 and the rest being 0. When
there is no ambiguity, we omit the length index of $e_i$.

• Matrix $\mathcal{I}_l$ denotes the l x l identity matrix.

• For a matrix M, $\mathcal{C}(M)$ is the linear space generated by its column vectors. It is usually called
the column space of M.

• For a matrix M, $C_M$ denotes the set of its column vectors and $R_M$ denotes the set of its row
vectors.

• Vector $\mathbf{0}$ denotes the zero vector, (0, ..., 0). When there is no ambiguity, we omit the index
of length.

• Define a $2^k$-dimensional vector

  $$p = (p_A : A \in \{0, 1\}^k).$$

• For m-dimensional vectors c and g, write $c \succ g$ if $c_i > g_i$ for all $1 \le i \le m$ and $c \ncong g$ if $c_i \ne g_i$
for all $i = 1, \ldots, m$.

• Matrix Q denotes the true matrix and Q' denotes a generic m x k binary matrix.
The following definitions will be used in subsequent discussions.

Definition 1 We say that T(Q) is saturated if all combinations of the form $I_{i_1} \wedge \cdots \wedge I_{i_l}$, for
$l = 1, \ldots, m$, are included in T(Q). Similarly, we say that U(Q) is saturated if all combinations of
the form $I_{i_1} \vee \cdots \vee I_{i_l}$, for $l = 1, \ldots, m$, are included in U(Q).

Definition 2 We write $Q \sim Q'$ if and only if Q and Q' have identical column vectors, which could
be arranged in different orders; otherwise, we write $Q \nsim Q'$.

Remark 1 It is not hard to show that "~" is an equivalence relation. Q ~ Q' if and only if they 
are identical after an appropriate permutation of the columns. Each column of Q is interpreted as 
an attribute. Permuting the columns of Q is equivalent to relabeling the attributes. For Q ~ Q' , 
we are not able to distinguish Q from Q' based on data. 
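
The equivalence relation is easy to check computationally. The following Python sketch, with hypothetical function names, compares two binary matrices up to a permutation of their columns.

```python
def columns(Q):
    """Columns of a matrix given as a list of rows."""
    return [tuple(row[j] for row in Q) for j in range(len(Q[0]))]


def equivalent(Q1, Q2) -> bool:
    """Q1 ~ Q2: the same collection of columns, possibly in a different order."""
    return sorted(columns(Q1)) == sorted(columns(Q2))


Q_a = [[1, 0], [0, 1], [1, 1]]
Q_b = [[0, 1], [1, 0], [1, 1]]   # Q_a with its two columns swapped
Q_c = [[1, 0], [1, 0], [1, 1]]   # a genuinely different matrix
same_up_to_relabeling = equivalent(Q_a, Q_b)
different = not equivalent(Q_a, Q_c)
```

Sorting the columns gives a canonical representative of each equivalence class, which is convenient when comparing an estimated Q-matrix with a reference one.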

Definition 3 A Q-matrix is said to be complete if $\{e_i^\top : i = 1, \ldots, k\} \subset R_Q$ ($R_Q$ is the set of row
vectors of Q); otherwise, we say that Q is incomplete.

A Q-matrix is complete if and only if for each attribute there exists an item requiring only
that attribute. Completeness implies that $m \ge k$. We will show that completeness is among the
sufficient conditions to identify Q. In addition, it is pointed out by Chiu et al. (2009) (cf. the paper
for a more detailed formulation and discussion) that the completeness of the Q-matrix is a necessary
condition for a set of items to consistently identify attributes. Thus, it is always recommended to
use a complete Q-matrix unless additional information is available.
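
Completeness can be verified by looking for each unit vector among the rows of Q. A minimal Python sketch with a hypothetical function name:

```python
def is_complete(Q) -> bool:
    """True iff every unit vector e_i (i = 1, ..., k) appears as a row of Q."""
    k = len(Q[0])
    rows = {tuple(r) for r in Q}
    return all(tuple(int(i == j) for j in range(k)) in rows for i in range(k))


complete_example = is_complete([[1, 0], [0, 1], [1, 1]])   # has (1,0) and (0,1)
incomplete_example = is_complete([[1, 0], [1, 1]])         # no item isolates attribute 2
```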

Listed below are assumptions which will be used in subsequent development. 

C1 Matrix Q is complete.

C2 Both T(Q) and U(Q) are saturated.

C3 Random vectors $A_1, \ldots, A_N$ are i.i.d. with the following distribution

$$P(A_r = A) = p_A;$$

we further let $p = (p_A : A \in \{0, 1\}^k)$.

C4 The attribute population is diversified, that is, $p \succ \mathbf{0}$.

2.3.2 Consistency results 

We first present the consistency results for the DINA model. 

Theorem 1 Under the DINA model, suppose that conditions C1-C4 hold, that is, Q is complete,
T(Q) is saturated, the attribute profiles are i.i.d., and p is diversified. Suppose also that c
and g are known. Let $S_{c,g}(Q')$ be as defined in (13) and

$$\hat Q(c, g) = \arg\inf_{Q'} S_{c,g}(Q').$$

Then,

$$\lim_{N \to \infty} P(\hat Q(c, g) \sim Q) = 1.$$

In addition, with an appropriate arrangement of the column order of $\hat Q(c, g)$, let

$$\tilde p = \arg\inf_{p'} |T_{c,g}(\hat Q(c, g)) p' - \alpha|.$$

Then, for any $\varepsilon > 0$,

$$\lim_{N \to \infty} P(|\tilde p - p| > \varepsilon) = 0.$$

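Theorem 2 Under the DINA model, suppose that the conditions in Theorem 1 hold, except that
c and g are unknown. For any Q', let $\hat c(Q')$ and $\hat g(Q')$ be estimators of c and g, and suppose that, when Q' = Q,
$(\hat c(Q), \hat g(Q))$ is a consistent estimator of (c, g). Let $\hat Q_{\hat c, \hat g}$ be as defined in (16). Then

$$\lim_{N \to \infty} P(\hat Q_{\hat c, \hat g} \sim Q) = 1.$$

In addition, with an appropriate arrangement of the column order of $\hat Q_{\hat c, \hat g}$, let

$$\tilde p = \arg\inf_{p'} |T_{\hat c(\hat Q_{\hat c, \hat g}), \hat g(\hat Q_{\hat c, \hat g})}(\hat Q_{\hat c, \hat g}) p' - \alpha|.$$

Then, for any $\varepsilon > 0$,

$$\lim_{N \to \infty} P(|\tilde p - p| > \varepsilon) = 0.$$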

In what follows, we present the consistency results for the DINO model. 

Theorem 3 Under the DINO model, suppose that conditions C1-C4 hold, that is, Q is complete,
U(Q) is saturated, the attribute profiles are i.i.d., and p is diversified. Suppose also that c and
g are known. Let $V_{c,g}(Q')$ be defined as in (17) and

$$\hat Q(c, g) = \arg\inf_{Q'} V_{c,g}(Q').$$

Then,

$$\lim_{N \to \infty} P(\hat Q(c, g) \sim Q) = 1.$$

In addition, with an appropriate arrangement of the column order of $\hat Q(c, g)$, let

$$\tilde p = \arg\inf_{p'} |U_{c,g}(\hat Q(c, g)) p' - \beta|.$$

Then, for any $\varepsilon > 0$,

$$\lim_{N \to \infty} P(|\tilde p - p| > \varepsilon) = 0.$$

Theorem 4 Under the DINO model, suppose that the conditions in Theorem 3 hold, except that
c and g are unknown. For any Q', let $\hat c(Q')$ and $\hat g(Q')$ be estimators of c and g, and suppose that, when Q' = Q,
$\hat c(Q)$ and $\hat g(Q)$ are consistent estimators of c and g. Let $\hat Q_{\hat c, \hat g}$ be defined as in (19). Then

$$\lim_{N \to \infty} P(\hat Q_{\hat c, \hat g} \sim Q) = 1.$$

In addition, with an appropriate arrangement of the column order of $\hat Q_{\hat c, \hat g}$, let

$$\tilde p = \arg\inf_{p'} |U_{\hat c(\hat Q_{\hat c, \hat g}), \hat g(\hat Q_{\hat c, \hat g})}(\hat Q_{\hat c, \hat g}) p' - \beta|.$$

Then, for any $\varepsilon > 0$,

$$\lim_{N \to \infty} P(|\tilde p - p| > \varepsilon) = 0.$$

Remark 2 It is not hard to verify that "~" defines a binary equivalence relation on the space
of m x k binary matrices, denoted by $\mathcal{M}_{m,k}$. As previously mentioned, the data do not contain
information about the specific meaning of the attributes. Therefore, we do not expect to distinguish
$Q_1$ from $Q_2$ if $Q_1 \sim Q_2$. Hence, the identifiability in the theorems is the strongest type that one
may expect. The corresponding quotient set is the finest resolution that is possibly identifiable based
on the data. Under weaker conditions, such as in the absence of completeness of the Q-matrix or of
complete diversity of the attribute distribution, the identifiability of the Q-matrix may be weaker,
which corresponds to a coarser quotient set.

Remark 3 We would like to point out that, when the estimators in (20) and (21) are chosen, $\hat Q_{\hat c, \hat g}$
is always a consistent estimator of Q, even if $(\hat c, \hat g)$ is not a consistent estimator of (c, g). This
is because the proof of Theorem 2 is based on the facts that $S_{\hat c(Q), \hat g(Q)}(Q) \to 0$ in probability and that, when
$Q' \nsim Q$, $S_{\hat c(Q'), \hat g(Q')}(Q')$ is bounded below by some $\delta > 0$. Given that $S_{c,g}(Q) \to 0$ and that $(\hat c, \hat g)$ is
chosen to minimize the objective function S, $S_{\hat c(Q), \hat g(Q)}(Q)$ decreases to zero regardless of whether or
not $(\hat c, \hat g)$ is consistent. In addition, the fact that $S_{\hat c(Q'), \hat g(Q')}(Q')$ is bounded below by some $\delta > 0$
does not require any consistency property of $(\hat c, \hat g)$. Therefore, the consistency of $\hat Q_{\hat c, \hat g}$ does not rely
on the consistency of $(\hat c, \hat g)$ if it is of the particular forms in (20) and (21). On the other hand,
in order for $\tilde p$ to be consistent, it is necessary to require the consistency of $(\hat c, \hat g)$. Therefore,
in the statement of Theorem 2 we require the consistency of $(\hat c, \hat g)$, though it is necessary to point
out this subtlety. A similar argument applies to Theorem 4 as well.



3 Discussions and implementation 

This paper focuses mostly on the estimation of the Q-matrix. In this section, we discuss several 
practical issues and a few other usages of the proposed tools. 



Computational issues. There are several aspects we would like to address. First, for a given
Q, the evaluation of $S_{c,g}(Q)$ only requires the optimization of a quadratic function subject to linear
constraints. This can be done with well-established quadratic programming algorithms.

Second, the theory requires the construction of a saturated T-matrix or U-matrix, which is $2^m - 1$
by $2^k$. Note that when m is reasonably large, for instance, m = 20, a saturated T-matrix has over
1 million rows. One solution is to include only part of the combinations and gradually include more
combinations if the criterion function admits small values at multiple Q-matrices. Alternatively, we
may split the items into multiple groups, which we elaborate on in the next paragraph.
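
The row counts involved can be computed directly. In the Python sketch below, the function names are hypothetical, and truncating by combination size is only one assumed way of "including part of the combinations"; the text does not prescribe a particular subset.

```python
from math import comb


def saturated_rows(m: int) -> int:
    """Number of rows of a saturated T- or U-matrix: all non-empty item subsets."""
    return 2 ** m - 1


def truncated_rows(m: int, max_size: int) -> int:
    """Rows kept when only combinations of at most max_size items are included."""
    return sum(comb(m, size) for size in range(1, max_size + 1))


rows_full = saturated_rows(20)        # every combination of 20 items
rows_pairs = truncated_rows(20, 2)    # single items and pairs only
```

Restricting to low-order combinations reduces the row count from exponential to polynomial in m, at the cost of no longer satisfying the saturation condition used in the theorems.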

The third computational issue is related to the minimization of $S_{c,g}(Q)$ with respect to Q. This
involves evaluating the function S over all the m x k binary matrices, a set of cardinality $2^{m \times k}$.
Simply searching through such a space is a substantial computational overhead. In practice, one may
want to handle such a situation by splitting the Q-matrix in the following manner. Suppose there
are m items. We split them into l groups, each of which has $m_0$ (a computationally manageable
number) items. This is equivalent to dividing a large Q-matrix into multiple smaller sub-matrices.
When necessary, we may allow different groups to have overlapping items. Then, we can estimate
each sub-matrix separately and merge them into an estimate of the big Q-matrix. Given that
the asymptotic results are applicable to each of the sub-matrices, the combined estimate is also
consistent. This is similar to the splitting procedure in Chapter 8.6 of Tatsuoka (2009). We
emphasize that splitting the parameter space is typically not valid for usual statistical inferences.
However, the Q-matrix admits a special structure with which the splitting is feasible and valid.
This partially helps to relieve the computational burden related to the proposed procedure. On the
other hand, it is always desirable to have a generic efficient algorithm for a general large-scale
Q-matrix. We leave this as a topic for future investigation.

Partially specified Q-matrix. It is often reasonable to assume that some entries of the Q- 
matrix are known. For example, suppose we can separate the attributes into "hard" and "soft" 
ones. By "hard", we mean those that are concrete and easily recognizable in a given problem and, 
by "soft", we mean those that are subtle and not obvious. We can then assume that the entry 
columns which correspond to the "hard" attributes are known. Another instance is that there is a 
subset of items whose attribute requirements are known and the item-attribute relationships of the 
other items need to be learnt, such as the scenarios when new items need to be calibrated according 
to the existing ones. In this sense, even if an estimated Q-matrix may not be sufficient to replace 
the a priori Q-matrix provided by the "expert" (such as exam makers), it can serve as a validation 
as well as a source of calibration of the existing knowledge of the Q-matrix. 

When such information is available and correct, the computation can be substantially reduced.
This is because the optimization, for instance that in (16), can be performed subject to the existing
knowledge of the Q-matrix. In particular, once a set of items is known to form a complete Q-matrix,
that is, item $i$ is known to only require attribute $i$ for $i = 1, \dots, k$, then one can calibrate one item
at a time. More specifically, at each time, one can estimate the sub-matrix consisting of items 1
to $k$ as well as one additional item, the computational cost of which is $O(2^k)$. Then the overall
computational cost is reduced to $O(m2^k)$, which is typically of a manageable order.
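The reduction can be illustrated by counting candidate Q-matrices. The sketch below is ours (the names `candidate_rows` and `search_sizes` are hypothetical) and it counts only the number of candidates visited, not the cost of evaluating the criterion at each of them.

```python
from itertools import product

def candidate_rows(k):
    """All binary attribute-requirement vectors for a single item."""
    return list(product([0, 1], repeat=k))

def search_sizes(m, k):
    """Candidates for a joint search versus one-item-at-a-time calibration."""
    joint = 2 ** (m * k)                          # all m x k binary matrices
    itemwise = (m - k) * len(candidate_rows(k))   # items k+1, ..., m, one at a time
    return joint, itemwise

joint, itemwise = search_sizes(20, 3)
print(joint, itemwise)
```

For $m = 20$ and $k = 3$ (values chosen only for illustration), the joint search space has $2^{60}$ elements, whereas item-by-item calibration visits $17 \times 8 = 136$ candidate rows, in line with the $O(m2^k)$ order stated above.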

Validation of a Q-matrix. The proposed framework is applicable not only to the estimation of
the Q-matrix but also to the validation of an existing Q-matrix. Consider the DINA and DINO models.
If the Q-matrix is correctly specified, then one may expect

$$|\alpha - T_{\hat c,\hat g}(Q)\,p| \to 0$$

in probability as $N \to \infty$. The above convergence requires no additional conditions (such as
completeness or diversified attribute distribution). In fact, it suffices to have that the responses are
conditionally independent given the attributes and $(\hat c, \hat g)$ are consistent estimators of $(c, g)$. Then,
one may expect that

$$S_{\hat c,\hat g}(Q) \to 0.$$

If the convergence rate of the estimators $(\hat c, \hat g)$ is known, for instance, $(\hat c - c, \hat g - g) = O_p(n^{-1/2})$,
then a necessary condition for a correctly specified Q-matrix is that $S_{\hat c,\hat g}(Q) = O_p(n^{-1/2})$. The
asymptotic distribution of $S$ depends on the specific form of $(\hat c, \hat g)$. Consequently, checking the
closeness of $S$ to zero forms a procedure for validation of the existing knowledge of the Q-matrix.

4 Proofs of the theorems 

4.1 Preliminary results: propositions and lemmas 

Proposition 1 Under the setting of the DINA model, suppose that $Q$ is complete and the matrix
$T(Q)$ is saturated. Then, we are able to arrange the columns and rows of $Q$ and $T(Q)$ such that
$T(Q)_{1:(2^k-1)}$ has rank $2^k - 1$, that is, after removing one zero column this sub-matrix has full column
rank.

Proof of Proposition 1. We let the first column of $T(Q)$ correspond to the zero attribute
profile. Then, the first column is a zero vector, which is the column we mean to remove in the
statement of the proposition. Provided that $Q$ is complete, without loss of generality we assume
that the $i$-th row vector of $Q$ is $e_i^\top$ for $i = 1, \dots, k$, that is, item $i$ only requires attribute $i$ for each
$i = 1, \dots, k$. The first $2^k - 1$ rows of $T(Q)$ are associated with $\{I_1, \dots, I_k\}$. In particular, we let the
first $k$ rows correspond to $I_1, \dots, I_k$ and the second to the $(k+1)$-th columns of $T(Q)$ correspond to
$A$'s that only have one attribute. We further arrange the next $\binom{k}{2}$ rows of $T(Q)$ to correspond to
combinations of two items, $I_i \wedge I_j$, $i \neq j$. The next $\binom{k}{2}$ columns of $T(Q)$ correspond to $A$'s that only
have two positive attributes. Similarly, we arrange $T(Q)$ for combinations of three, four, and up to
$k$ items. Therefore, the first $2^k - 1$ rows of $T(Q)$ admit a block upper triangular form. In addition,
we are able to further arrange the columns within each block such that the diagonal matrices are
identities, so that $T(Q)$ has the form (22), in which the first column is zero and the remaining
columns of the first $2^k - 1$ rows are block upper triangular with identity diagonal blocks.
$T(Q)_{1:(2^k-1)}$ obviously has full rank after removing the zero (first) column. ■

From now on, we assume that $Q_{1:k} = I_k$ and the first $2^k - 1$ rows of $T(Q)$ are arranged in the
order as in (22).
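The rank statement of Proposition 1 can be checked numerically on a small example. The sketch below is ours: the function names are hypothetical, the rows are enumerated by subset size rather than in the exact arrangement of the proof (which does not affect the rank), and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import combinations, product

def dina_T(Q):
    """Saturated T(Q) under DINA with c = 1, g = 0 (entries are 0/1)."""
    m, k = len(Q), len(Q[0])
    profiles = list(product([0, 1], repeat=k))   # first profile is all zeros
    rows = []
    for size in range(1, m + 1):
        for combo in combinations(range(m), size):
            rows.append([int(all(A[j] >= Q[i][j] for i in combo for j in range(k)))
                         for A in profiles])
    return rows

def rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                factor = M[i][col] / M[r][col]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

k = 3
# A complete Q: the first k rows form the identity; one extra item is appended.
Q = [[int(i == j) for j in range(k)] for i in range(k)] + [[1, 1, 0]]
T = dina_T(Q)
zero_column = [row[0] for row in T]
T_reduced = [row[1:] for row in T]               # drop the zero-profile column
print(rank(T), rank(T_reduced), len(T_reduced[0]))
```

In this example the column of the zero profile vanishes and the remaining $2^k - 1 = 7$ columns are linearly independent, as the proposition asserts.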

Proposition 2 Under the DINA model, that is, the ability indicator follows (1), assume that $Q$
is a complete matrix and $T(Q)$ is saturated. Without loss of generality, let $Q_{1:k} = I_k$. Assume
that the first $k$ rows of $Q'$ form a complete matrix. Further, assume that $Q_{1:k} = Q'_{1:k} = I_k$. If
$Q' \neq Q$ and $c \ncong g$, then for all $c' \in \mathbb{R}^m$ there exists at least one column vector of $T_{c,g}(Q)$ not in the
column space $C(T_{c'}(Q'))$, where $T_{c'}(Q')$ is as defined in (12), being the T-matrix with zero guessing
probabilities.

Proposition 3 Under the DINA model, that is, the ability indicator follows (1), assume that $Q$ is
a complete matrix and $T(Q)$ is saturated. Without loss of generality, let $Q_{1:k} = I_k$. If $c \ncong g$ and
$Q'_{1:k}$ is incomplete, then for all $c' \in \mathbb{R}^m$ there exists at least one nonzero column vector of $T_{c,g}(Q)$
not in the column space $C(T_{c'}(Q'))$.

In the statement of Propositions 2 and 3, $c_i$, $g_i$, and $c'_i$ can be any real numbers and are not
restricted to be in $[0,1]$. Propositions 2 and 3 are the central results of this paper, whose proofs
are delayed to the Appendix. To state the next proposition, we define the matrix

$$\tilde T_{c,g}(Q) = \begin{pmatrix} T_{c,g}(Q) \\ E \end{pmatrix}, \qquad (23)$$

that is, we add one more row of ones to the original T-matrix.

Proposition 4 Under the DINA model, that is, the ability indicator follows (1), suppose that $Q$ is
a complete matrix, $Q' \nsim Q$, $T$ is saturated, and $c \ncong g$. Then, for all $c, g, c', g' \in [0,1]^m$, there exists
one column vector of $\tilde T_{c,g}(Q)$ (depending on $c,g,c',g'$) not in $C(\tilde T_{c',g'}(Q'))$. In addition, $\tilde T_{c,g}(Q)$ is
of full column rank.

Lemma 1 Consider two matrices $T_1$ and $T_2$ of the same dimension. If $C(T_1) \subset C(T_2)$, then for
any matrix $D$ of appropriate dimension for multiplication, we have

$$C(DT_1) \subset C(DT_2).$$

Conversely, if the $l$-th column vector of $DT_1$ does not belong to $C(DT_2)$, then the $l$-th column
vector of $T_1$ does not belong to $C(T_2)$.

Proof of Lemma 1. Note that $DT_i$ is just a linear row transform of $T_i$ for $i = 1,2$. The
conclusion is immediate by basic linear algebra. ■

Proof of Proposition 4. According to Propositions 2 and 3 and Lemma 1, it is sufficient to
show that there exists a matrix $D$ such that

$$D\tilde T_{c,g}(Q) = T_{c-g',g-g'}(Q), \qquad D\tilde T_{c',g'}(Q') = T_{c'-g',0}(Q') \triangleq T_{c'_{g'}}(Q'),$$

where $c'_{g'} = c' - g'$. Once we obtain such a linear transformation, according to Propositions 2 and 3,
there exists a column vector in $T_{c-g',g-g'}(Q)$ that is not in the column space of $T_{c'_{g'}}(Q')$, as long as
$Q \nsim Q'$. Then the same column vector in $\tilde T_{c,g}(Q)$ is not in the column space of $\tilde T_{c',g'}(Q')$. Thereby,
we are able to conclude the proof.

In what follows, we construct such a $D$ matrix. Let $g^* = (g^*_1, \dots, g^*_m)$. We show that there exists
a matrix $D_{g^*}$ only depending on $g^*$ so that $D_{g^*}\tilde T_{c,g}(Q) = T_{c-g^*,g-g^*}(Q)$. Note that each row of
$D_{g^*}\tilde T_{c,g}(Q)$ is just a row linear transform of $\tilde T_{c,g}(Q)$. Then, it is sufficient to show that each row
vector of $T_{c-g^*,g-g^*}(Q)$ is a linear transform of rows of $\tilde T_{c,g}(Q)$ with coefficients only depending on
$g^*$. We prove this by induction.

First, note that

$$B_{c-g^*,g-g^*,Q}(I_i) = B_{c,g,Q}(I_i) - g^*_i E.$$

Then all row vectors of $T_{c-g^*,g-g^*}(Q)$ of the form $B_{c-g^*,g-g^*,Q}(I_i)$ are inside the row space of $\tilde T_{c,g}(Q)$
with coefficients only depending on $g^*$. Suppose that all the vectors of the form

$$B_{c-g^*,g-g^*,Q}(I_{i_1} \wedge \dots \wedge I_{i_l})$$

for all $1 \le l \le j$ can be written as linear combinations of the row vectors of $\tilde T_{c,g}(Q)$ with coefficients
only depending on $g^*$. Then, we consider

$$B_{c,g,Q}(I_{i_1} \wedge \dots \wedge I_{i_{j+1}}) = \Upsilon_{h=1}^{j+1} \left( B_{c-g^*,g-g^*,Q}(I_{i_h}) + g^*_{i_h} E \right),$$

where $\Upsilon$ denotes the elementwise product of vectors. The left hand side is just a row vector of
$\tilde T_{c,g}(Q)$. We expand the right hand side of the above display. Note that the last term is precisely

$$B_{c-g^*,g-g^*,Q}(I_{i_1} \wedge \dots \wedge I_{i_{j+1}}) = \Upsilon_{h=1}^{j+1} B_{c-g^*,g-g^*,Q}(I_{i_h}).$$

The remaining terms are all of the form $B_{c-g^*,g-g^*,Q}(I_{i_1} \wedge \dots \wedge I_{i_l})$ for $1 \le l \le j$ multiplied by coefficients
only depending on $g^*$, together with a multiple of $E$, which is itself a row of $\tilde T_{c,g}(Q)$. Therefore,
according to the induction assumption, we have that

$$B_{c-g^*,g-g^*,Q}(I_{i_1} \wedge \dots \wedge I_{i_{j+1}})$$

can be written as linear combinations of rows of $\tilde T_{c,g}(Q)$ with coefficients only depending on $g^*$.
Therefore, we can construct the matrix $D_{g^*}$ accordingly. Lastly, we choose $g^* = g'$ and conclude
that

$$D_{g'}\tilde T_{c,g}(Q) = T_{c-g',g-g'}(Q), \qquad D_{g'}\tilde T_{c',g'}(Q') = T_{c'_{g'}}(Q').$$

By Propositions 2 and 3, there exists a column vector of $T_{c-g',g-g'}(Q)$ not in the column space of
$T_{c'_{g'}}(Q')$. Furthermore, according to Lemma 1, we conclude the first part of the proposition.
In addition, consider $D_g\tilde T_{c,g}(Q) = T_{c_g}(Q)$, where $c_g = c - g \ncong 0$. By construction as in (22),
after removing the first zero column, $T_{c_g}(Q)$ is of rank $2^k - 1$. Therefore, the matrix

$$\begin{pmatrix} T_{c_g}(Q) \\ E \end{pmatrix}$$

is of full rank. Note that each row of the above matrix is a linear transform of $\tilde T_{c,g}(Q)$. Thus,
$\tilde T_{c,g}(Q)$ is a full rank matrix too. Thereby, we conclude the proof of the proposition. ■
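The construction of $D_{g^*}$ can be verified on a small example. In the sketch below, which is ours and uses hypothetical names and parameter values, the entry of $D_{g^*}$ in the row of a combination $C$ and the column of a sub-combination $S \subseteq C$ is taken to be $\prod_{i \in C \setminus S} (-g^*_i)$, with the empty $S$ mapped to the appended row $E$; this is what expanding the elementwise product gives. Exact rational arithmetic makes the comparison an equality rather than an approximation.

```python
from fractions import Fraction
from itertools import combinations, product

def item_vector(i, Q, c, g, profiles):
    """B_{c,g,Q}(I_i): c[i] where the profile masters item i under DINA, else g[i]."""
    return [c[i] if all(a >= q for a, q in zip(A, Q[i])) else g[i] for A in profiles]

def t_matrix(Q, c, g):
    """Return (combos, rows) of the saturated T_{c,g}(Q); rows are elementwise products."""
    m, k = len(Q), len(Q[0])
    profiles = list(product([0, 1], repeat=k))
    combos = [s for size in range(1, m + 1) for s in combinations(range(m), size)]
    rows = []
    for combo in combos:
        row = [Fraction(1)] * len(profiles)
        for i in combo:
            row = [x * y for x, y in zip(row, item_vector(i, Q, c, g, profiles))]
        rows.append(row)
    return combos, rows

def build_D(combos, g_star):
    """D[C][S] = product of -g_star[i] over i in C but not in S, for S a subset of C."""
    index = {s: n for n, s in enumerate(combos)}
    D = [[Fraction(0)] * (len(combos) + 1) for _ in combos]
    for r, C in enumerate(combos):
        for size in range(len(C) + 1):
            for S in combinations(C, size):
                coef = Fraction(1)
                for i in C:
                    if i not in S:
                        coef *= -g_star[i]
                col = index[S] if S else len(combos)   # empty S maps to the row of ones
                D[r][col] = coef
    return D

Q = [[1, 0], [0, 1], [1, 1]]
c = [Fraction(9, 10), Fraction(4, 5), Fraction(17, 20)]
g = [Fraction(1, 5), Fraction(1, 10), Fraction(3, 20)]
g_star = [Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)]

combos, T = t_matrix(Q, c, g)
T_tilde = T + [[Fraction(1)] * len(T[0])]              # append the row E of ones
D = build_D(combos, g_star)
DT = [[sum(d * T_tilde[j][col] for j, d in enumerate(row)) for col in range(len(T[0]))]
      for row in D]
_, T_shifted = t_matrix(Q, [x - s for x, s in zip(c, g_star)],
                        [x - s for x, s in zip(g, g_star)])
print(DT == T_shifted)
```

Note that `D` depends only on `g_star` and on the list of item combinations, not on `Q`, `c`, or `g`, which is the property the proof relies on.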
For the DINO model, we define a similar matrix

$$\tilde U_{c,g}(Q) = \begin{pmatrix} U_{c,g}(Q) \\ E \end{pmatrix} \qquad (24)$$

and collect the following proposition.

Proposition 5 Under the setting of the DINO model, that is, the ability indicator follows (2),
suppose that $Q$ is a complete matrix, $Q' \nsim Q$, $U$ is saturated, and $c \ncong g$. Then, for all $c,g,c',g' \in
[0,1]^m$, there exists one column vector of $\tilde U_{c,g}(Q)$ not in $C(\tilde U_{c',g'}(Q'))$. In addition, $\tilde U_{c,g}(Q)$ is of
full column rank.

Lemma 2 Let $T(Q)$ be the T-matrix under the DINA model with $c = 1$ and $g = 0$ and $U(Q)$ be
the U-matrix under the DINO model with $c = 1$ and $g = 0$. We are able to arrange the column order
of $T(Q)$ and $U(Q)$ so that

$$T(Q) + U(Q) = E,$$

where $E$ is a matrix of appropriate dimensions with all entries being ones.
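Lemma 2 is easy to check by brute force. The sketch below is ours, with hypothetical function names; it assumes, as in the displays of the DINO proofs, that a row of $U(Q)$ for several items corresponds to their "or" combination, and it orders the columns of $T(Q)$ by the complementary profile $A^c$.

```python
from itertools import combinations, product

def xi_dina(A, q):
    """1 if profile A has every attribute required by row q."""
    return int(all(a >= r for a, r in zip(A, q)))

def xi_dino(A, q):
    """1 if profile A has at least one attribute required by row q."""
    return int(any(a == 1 and r == 1 for a, r in zip(A, q)))

def t_and_u(Q):
    """T(Q) with columns ordered by the complement profile, U(Q) ordered by the profile."""
    m, k = len(Q), len(Q[0])
    profiles = list(product([0, 1], repeat=k))
    T, U = [], []
    for size in range(1, m + 1):
        for combo in combinations(range(m), size):
            # T row: "and" of the items, evaluated at the complement of A
            T.append([int(all(xi_dina(tuple(1 - a for a in A), Q[i]) for i in combo))
                      for A in profiles])
            # U row: "or" of the items, evaluated at A
            U.append([int(any(xi_dino(A, Q[i]) for i in combo)) for A in profiles])
    return T, U

Q = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
T, U = t_and_u(Q)
all_ones = all(t + u == 1 for t_row, u_row in zip(T, U) for t, u in zip(t_row, u_row))
print(all_ones)
```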

Proof of Lemma 2. Consider a Q-matrix, an attribute profile $A$, and an item $i$. Let $A^c = E - A$
be the complementary profile. Suppose that $Q_{ij} = 1$ for $1 \le j \le n$ and $Q_{ij} = 0$ for $n < j \le k$. Under
the DINO model, $\xi^i_{DINO}(A, Q) = 1$ if $A^j = 1$ for at least one $1 \le j \le n$. For the same $j$, $(A^c)^j = 0$
and therefore $\xi^i_{DINA}(A^c, Q) = 0$. That is, $\xi^i_{DINO}(A,Q) = 1$ implies that $\xi^i_{DINA}(A^c, Q) = 0$.
Similarly, we obtain that $\xi^i_{DINO}(A,Q) = 0$ implies that $\xi^i_{DINA}(A^c, Q) = 1$. Therefore,
if we arrange the columns of $T(Q)$ and $U(Q)$ in such a way that the $A$-th column of $U(Q)$ and the
$A^c$-th column of $T(Q)$ have the same position, then

$$B_Q(I_i) + F_Q(I_i) = E$$

for all $1 \le i \le m$. Note that, with $\Upsilon$ denoting the elementwise product,

$$B_Q(I_{i_1} \wedge \dots \wedge I_{i_l}) = \Upsilon_{h=1}^{l} B_Q(I_{i_h}) = \Upsilon_{h=1}^{l} \left(E - F_Q(I_{i_h})\right) = E - F_Q(I_{i_1} \vee \dots \vee I_{i_l}).$$

Thus, we conclude the proof. ■

Proof of Proposition 5. Thanks to Propositions 2 and 3 and Lemma 1, it is sufficient to show
that with an appropriate order of the columns of $\tilde U_{c,g}(Q)$ there exists a matrix $D'_{c'}$ only depending
on $c' = (c'_1, \dots, c'_m)$ (independent of $Q$) such that

$$D'_{c'}\tilde U_{c,g}(Q) = T_{c'-g,c'-c}(Q)$$

for all $m \times k$ binary matrices $Q$. To establish that, we only need to show that each row vector of
$T_{c'-g,c'-c}(Q)$ can be written as a linear combination of the row vectors of $\tilde U_{c,g}(Q)$. In addition, the
coefficients only depend on $c'$ and are independent of $c$, $g$, and $Q$.

We establish this by induction. By construction, we have that for each $i = 1, \dots, m$

$$E - F_{c,g,Q}(I_i) = (1 - c_i)E + (c_i - g_i)\left(E - F_Q(I_i)\right).$$

Note that each column of $U$ (and $T$) and each element in $F_Q(I_i)$ (and $B_Q(I_i)$) correspond to one
attribute profile $A \in \{0,1\}^k$. If we arrange the $A$-th position of $F_Q(I_i)$ and the $A^c$-th position of $B_Q(I_i)$
to be the same, then from the proof of Lemma 2 we obtain that $B_Q(I_i) = E - F_Q(I_i)$. Therefore,
$E - F_{c,g,Q}(I_i) = B_{1-g,1-c,Q}(I_i)$. Similarly, with $\Upsilon$ denoting the elementwise product, we obtain that

$$E - F_{c,g,Q}(I_{i_1} \vee \dots \vee I_{i_l}) = \Upsilon_{j=1}^{l} \left(E - F_{c,g,Q}(I_{i_j})\right) = \Upsilon_{j=1}^{l} B_{1-g,1-c,Q}(I_{i_j}) = B_{1-g,1-c,Q}(I_{i_1} \wedge \dots \wedge I_{i_l}),$$

where $1 - c = (1 - c_1, \dots, 1 - c_m)$. Let $E$ be the matrix with all entries being ones. We essentially
established that

$$E - U_{c,g}(Q) = T_{1-g,1-c}(Q).$$

We use the matrix $D_{g^*}$ constructed in Proposition 4 and obtain that

$$D_{1-c'}\begin{pmatrix} E - U_{c,g}(Q) \\ E \end{pmatrix} = D_{1-c'}\tilde T_{1-g,1-c}(Q) = T_{c'-g,c'-c}(Q).$$

Similarly, we have that

$$D_{1-c'}\begin{pmatrix} E - U_{c',g'}(Q') \\ E \end{pmatrix} = D_{1-c'}\tilde T_{1-g',1-c'}(Q') = T_{c'-g'}(Q').$$

Note that $E$ is a row vector of both $\tilde U_{c,g}(Q)$ and $\tilde U_{c',g'}(Q')$. Therefore, one can construct a matrix
$D'_{c'}$ so that

$$D'_{c'}\tilde U_{c,g}(Q) = T_{c'-g,c'-c}(Q), \qquad D'_{c'}\tilde U_{c',g'}(Q') = T_{c'-g'}(Q').$$

Thanks to Propositions 2 and 3, there exists a column vector of $T_{c'-g,c'-c}(Q)$ not inside the
column space of $T_{c'-g'}(Q')$ whenever $c \ncong g$. Thanks to Lemmas 1 and 2, the corresponding column
vector(s) of $\tilde U_{c,g}(Q)$ is not inside the column space of $\tilde U_{c',g'}(Q')$. In addition, note that

$$\begin{pmatrix} T_{c'-g,c'-c}(Q) \\ E \end{pmatrix}$$

is of full column rank (Proposition 4) and can be obtained by a row transformation of $\tilde U_{c,g}(Q)$.
Therefore, $\tilde U_{c,g}(Q)$ is also of full column rank. Thereby, we conclude the proof. ■
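The identity $E - U_{c,g}(Q) = T_{1-g,1-c}(Q)$ used in the proof can be checked entry by entry. The sketch below is ours; the function names and the parameter values are hypothetical, the columns of the DINA side are indexed by the complementary profile, and rational arithmetic keeps the comparison exact.

```python
from fractions import Fraction
from itertools import combinations, product

def complement(A):
    return tuple(1 - a for a in A)

def dina_entry(combo, Q, c, g, A):
    """Entry of T_{c,g}(Q) for the 'and' combination of items at profile A."""
    out = Fraction(1)
    for i in combo:
        capable = all(a >= q for a, q in zip(A, Q[i]))
        out *= c[i] if capable else g[i]
    return out

def dino_entry(combo, Q, c, g, A):
    """Entry of U_{c,g}(Q) for the 'or' combination: 1 - prod_i (1 - F_{c,g,Q}(I_i))."""
    miss = Fraction(1)
    for i in combo:
        capable = any(a == 1 and q == 1 for a, q in zip(A, Q[i]))
        miss *= 1 - (c[i] if capable else g[i])
    return 1 - miss

Q = [[1, 0], [0, 1], [1, 1]]
c = [Fraction(9, 10), Fraction(4, 5), Fraction(17, 20)]
g = [Fraction(1, 5), Fraction(1, 10), Fraction(3, 20)]
one_minus_c = [1 - x for x in c]
one_minus_g = [1 - x for x in g]

m, k = len(Q), len(Q[0])
identity_holds = all(
    1 - dino_entry(combo, Q, c, g, A)
    == dina_entry(combo, Q, one_minus_g, one_minus_c, complement(A))
    for size in range(1, m + 1)
    for combo in combinations(range(m), size)
    for A in product([0, 1], repeat=k)
)
print(identity_holds)
```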

4.2 Proof of the theorems 

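Proof of Theorem 1. Notice that the true parameters $c$ and $g$ form consistent estimators for
themselves. Therefore, Theorem 1 is a direct corollary of Theorem 2. ■

Proof of Theorem 2. By the law of large numbers,

$$|T_{c,g}(Q)p - \alpha| \to 0$$

almost surely as $N \to \infty$. Therefore,

$$S_{c,g}(Q) \to 0$$

almost surely as $N \to \infty$. Note that $S_{c,g}(Q)$ is a continuous function of $(c,g)$. The consistency of
$(\hat c(Q),\hat g(Q))$ implies that

$$S_{\hat c(Q),\hat g(Q)}(Q) \to 0$$

in probability as $N \to \infty$.
For any $Q' \nsim Q$, note that

$$\begin{pmatrix} \alpha \\ 1 \end{pmatrix} \to \tilde T_{c,g}(Q)p.$$

According to Proposition 4 and the fact that $p \succ 0$, there exists $\delta(c',g') > 0$ such that $\delta(c',g')$ is
continuous in $(c',g')$ and

$$\inf_{p'} \left| \tilde T_{c',g'}(Q')p' - \tilde T_{c,g}(Q)p \right| > \delta(c',g').$$

By elementary calculus,

$$\delta \triangleq \inf_{c',g' \in [0,1]^m} \delta(c',g') > 0$$

and

$$\inf_{c',g' \in [0,1]^m,\, p'} \left| \tilde T_{c',g'}(Q')p' - \tilde T_{c,g}(Q)p \right| \geq \delta.$$

Therefore,

$$P\left( \inf_{c',g' \in [0,1]^m,\, p'} \left| \tilde T_{c',g'}(Q')p' - \begin{pmatrix} \alpha \\ 1 \end{pmatrix} \right| > \delta/2 \right) \to 1$$

as $N \to \infty$. For the same $\delta$, we have

$$P\left(S_{\hat c(Q'),\hat g(Q')}(Q') > \delta/2\right) \geq P\left(\inf_{c',g' \in [0,1]^m} S_{c',g'}(Q') > \delta/2\right) = P\left(\inf_{c',g' \in [0,1]^m,\, p'} |T_{c',g'}(Q')p' - \alpha| > \delta/2\right) \to 1.$$

The above minimization in the last probability is subject to the constraint that

$$\sum_{A \in \{0,1\}^k} p'_A = 1.$$

Together with the fact that there are only finitely many $m \times k$ binary matrices, we have

$$P(\hat Q_{\hat c,\hat g} \sim Q) \to 1.$$

We arrange the columns of $\hat Q_{\hat c,\hat g}$ so that $P(\hat Q_{\hat c,\hat g} = Q) \to 1$ as $N \to \infty$.
Now we proceed to the proof of consistency for $\hat p$. Note that

$$\left| \tilde T_{\hat c(\hat Q_{\hat c,\hat g}),\hat g(\hat Q_{\hat c,\hat g})}(\hat Q_{\hat c,\hat g})\,\hat p - \begin{pmatrix} \alpha \\ 1 \end{pmatrix} \right| \to 0, \qquad \left| \tilde T_{\hat c(Q),\hat g(Q)}(Q)\,p - \begin{pmatrix} \alpha \\ 1 \end{pmatrix} \right| \to 0.$$

Note that $\tilde T_{c,g}(Q)$ is a full column rank matrix, $P(\hat Q_{\hat c,\hat g} = Q) \to 1$, $\hat c(Q) \to c$, $\hat g(Q) \to g$, and $\tilde T_{c,g}$ is
continuous in $(c,g)$. Then, we obtain that $\hat p \to p$ in probability. ■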

Proof of Theorem 3. Similar to Theorem 1, Theorem 3 is a direct corollary of Theorem 4. ■

Proof of Theorem 4. The proof of Theorem 4 is completely analogous to that of Theorem 2.
Therefore, we omit the details. ■



A Technical proofs 

Proof of Proposition 2. Note that $Q_{1:k} = Q'_{1:k} = I_k$. Let $T(\cdot)$ be arranged as in (22).

Then, $T(Q)_{1:(2^k-1)} = T(Q')_{1:(2^k-1)}$. Given that $Q \neq Q'$, we have $T(Q) \neq T(Q')$. We assume that
$T(Q)_{li} \neq T(Q')_{li}$, where $T(Q)_{li}$ is the entry in the $l$-th row and $i$-th column. Since $T(Q)_{1:(2^k-1)} =
T(Q')_{1:(2^k-1)}$, it is necessary that $l \geq 2^k$. In addition, we let the $l$-th row correspond to a single
item (not a combination of multiple items).

Suppose that the $l$-th row of $T(Q')$ corresponds to an item that requires attributes $i_1, \dots, i_{l'}$.
Then, we consider $1 \le h \le 2^k - 1$ such that the $h$-th row of $T(Q')$ is $B_{Q'}(I_{i_1} \wedge \dots \wedge I_{i_{l'}})$. Then, the
$h$-th row vector and the $l$-th row vector of $T(Q')$ are identical.

Since $T(Q)_{1:(2^k-1)} = T(Q')_{1:(2^k-1)}$, we have $T(Q)_{hj} = T(Q')_{hj} = T(Q')_{lj}$ for $j = 1, \dots, 2^k - 1$.
If $T(Q)_{li} = 0$ and $T(Q')_{li} = 1$, the matrices $T(Q)$ and $T(Q')$ have the following pattern in column $i$:
rows $h$ and $l$ of $T(Q')$ both have entry 1, whereas in $T(Q)$ row $h$ has entry 1 and row $l$ has entry 0.



Case 1 The $h$-th and $l$-th row vectors of $T_{c'}(Q')$ are nonzero vectors.
Consider the following two submatrices

$$M_1 = \begin{pmatrix} T_{c,g}(Q)_{hi} & T_{c,g}(Q)_{h2^k} \\ T_{c,g}(Q)_{li} & T_{c,g}(Q)_{l2^k} \end{pmatrix}, \qquad M_2 = \begin{pmatrix} T_{c'}(Q')_{hi} & T_{c'}(Q')_{h2^k} \\ T_{c'}(Q')_{li} & T_{c'}(Q')_{l2^k} \end{pmatrix}.$$

By construction, $T(Q')_{hj} = T(Q')_{lj}$ for all $j$, so all column vectors of $M_2$ are proportional to
each other. In what follows, we identify one column of $T_{c,g}(Q)$ that is not in the column space
of $T_{c'}(Q')$. Also, it is useful to keep in mind that the $2^k$-th (last) column of $T$ corresponds to
the attribute profile $(1, \dots, 1)$.

a1 If $T(Q)_{li} = 0$ and $T(Q)_{hi} = 1$, then $T_{c,g}(Q)_{hi} = T_{c,g}(Q)_{h2^k}$. Since $c \ncong g$, we obtain that
$T_{c,g}(Q)_{li} \neq T_{c,g}(Q)_{l2^k}$. There are two situations:

b1 $T_{c,g}(Q)_{hi} = T_{c,g}(Q)_{h2^k} \neq 0$. It is straightforward to see that the column space
of $M_2$ does not contain both column vectors of $M_1$. This is because $T_{c,g}(Q)_{hi} =
T_{c,g}(Q)_{h2^k} \neq 0$ and $T_{c,g}(Q)_{li} \neq T_{c,g}(Q)_{l2^k}$ imply that the two column vectors of $M_1$
are not proportional to each other. Then, either the $i$-th column or the $2^k$-th column
of $T_{c,g}(Q)$ is not in the column space of $T_{c'}(Q')$.

b2 $T_{c,g}(Q)_{hi} = T_{c,g}(Q)_{h2^k} = 0$. $T_{c,g}(Q)_{li} \neq T_{c,g}(Q)_{l2^k}$ implies that at least one of them
is nonzero. Suppose that $T_{c,g}(Q)_{li} \neq 0$; then the $i$-th column of $T_{c,g}(Q)$ is not in the
column space of $T_{c'}(Q')$. This is because the $h$-th row of $T_{c'}(Q')$ is not a zero vector
and any vector of the form

$$\begin{pmatrix} 0 \\ \text{nonzero} \end{pmatrix} \qquad (25)$$

is not in the column space of $M_2$. Similarly, if $T_{c,g}(Q)_{l2^k} \neq 0$, then the $2^k$-th
column is identified.

a2 If $T(Q)_{li} = 1$ and $T(Q)_{hi} = 0$, then $T_{c,g}(Q)_{li} = T_{c,g}(Q)_{l2^k}$. Note that row $h$ corresponds
to a combination of items (or just one item) each of which only requires one attribute.
Therefore, we may choose column $i$ such that the corresponding attribute is capable
of answering all items in row $h$ except for one. With this construction, if $T_{c,g}(Q)_{hi} =
T_{c,g}(Q)_{h2^k}$, then they must both be zero (most of the time $T_{c,g}(Q)_{hi}$ and $T_{c,g}(Q)_{h2^k}$ are
distinct). We consider three situations:

c1 $T_{c,g}(Q)_{hi} \neq T_{c,g}(Q)_{h2^k}$. Similar to a1, the conclusion is straightforward.

c2 $T_{c,g}(Q)_{hi} = T_{c,g}(Q)_{h2^k} = 0$ and $T_{c,g}(Q)_{li} = T_{c,g}(Q)_{l2^k} \neq 0$. Similar to b2, since the
$h$-th row vector of $T_{c'}(Q')$ is nonzero, the statement of the proposition also holds.

c3 $T_{c,g}(Q)_{hi} = T_{c,g}(Q)_{h2^k} = 0$ and $T_{c,g}(Q)_{li} = T_{c,g}(Q)_{l2^k} = 0$. This situation is slightly
complicated, since $M_1$ is a zero matrix and we have to seek a different column
other than the columns $i$ and $2^k$. In what follows, all the item-attribute relationships
refer to $Q$. If the item in the $l$-th row does not require strictly fewer attributes
than the items in row $h$, then we are able to find a column as in a1.

Otherwise, the item in the $l$-th row DOES require strictly fewer attributes than the
items in row $h$. Without loss of generality, assume that the item corresponding
to the $l$-th row requires attributes $1, 2, \dots, j$ and the $h$-th row corresponds to items
$1, 2, \dots, j'$. Suppose that for all $i' = 1, \dots, 2^k$, $T_{c,g}(Q)_{li'} = 0$ implies $T_{c,g}(Q)_{hi'} = 0$
(otherwise the $i'$-th column is not in the column space of $T_{c'}(Q')$, cf. b2). By slightly
abusing notation, we let $c_l = 0$ be the correct answering parameter and $g_l \neq 0$ be
the guessing parameter of the item in the $l$-th row of $T_{c,g}(Q)$. Let $A = e_{j'}$. Then,
the $A$-th element of the $l$-th row is $g_l \neq 0$ (equivalently, $A$ is NOT able to answer
that item).

d1 Suppose that the $A$-th element of the $h$-th row of $T_{c,g}(Q)$ is nonzero. Let
$\mathbf{0} = (0, \dots, 0)$ be the (zero) attribute profile that has precisely one fewer attribute than
$A$. Then, the $\mathbf{0}$-th element of the $l$-th row of $T_{c,g}(Q)$ equals the $A$-th element of
that row (being $g_l$). The $\mathbf{0}$-th and the $A$-th elements of the $h$-th row of $T_{c,g}(Q)$
are different. This is because the $\mathbf{0}$-th and the $A$-th elements of the $h$-th row
are

$$\prod_{i'=1}^{j'} g_{i'}, \qquad c_{j'} \prod_{i'=1}^{j'-1} g_{i'}.$$

Thereby, we can identify the vector from either the $A$-th or the $\mathbf{0}$-th column
vector. (Note that the $\mathbf{0}$-th column of $T_{c,g}(Q)$ is the first column, which is not
a zero vector.)

d2 Suppose that the $A$-th element of the $h$-th row of $T_{c,g}(Q)$ is zero. Then, the
$A$-th column is not in the column space of $T_{c'}(Q')$, because its $l$-th element is
nonzero and the $h$-th element is zero (cf. b2).

Case 2 Either the $h$-th or $l$-th row vector of $T_{c'}(Q')$ is a zero vector. Since both the $h$-th and $l$-th
rows of $T_{c,g}(Q)$ are nonzero vectors, we are always able to identify a column in $T_{c,g}(Q)$ that
is not in the column space of $T_{c'}(Q')$.



Proof of Proposition 3.
Step 1

We first identify two row vectors such that they are identical in $T(Q')$ but distinct in $T(Q)$. It
turns out that we only need to consider the first $k$ items. Consider $Q'$ such that $Q'_{1:k}$ is incomplete.
We discuss the following situations.

1. There are two row vectors, say the $i$-th and $j$-th row vectors ($1 \le i, j \le k$), in $Q'_{1:k}$ that are
identical. Equivalently, two items require exactly the same attributes according to $Q'$. Then,
the row vectors in $T(Q')$ corresponding to these two items are identical. All of the first $2^k - 1$
row vectors in $T(Q)$ must be different, because $T(Q)_{1:(2^k-1)}$ has rank $2^k - 1$.

2. No two row vectors in $Q'_{1:k}$ are identical. Then, among the first $k$ rows of $Q'$ there is at least
one row vector containing two or more nonzero entries. That is, there exists $1 \le i \le k$ such
that

$$\sum_{j=1}^{k} Q'_{ij} > 1.$$

This is because if each of the first $k$ items requires only one attribute and $Q'_{1:k}$ is not complete,
there are at least two items that require the same attribute. Then, there are two identical
row vectors in $Q'_{1:k}$ and it belongs to the first situation. We define

$$a_i = \sum_{j=1}^{k} Q'_{ij},$$

the number of attributes required by item $i$ according to $Q'$.
Without loss of generality, assume $a_i > 1$ for $i = 1, \dots, n$ and $a_i = 1$ for $i = n + 1, \dots, k$.
Equivalently, among the first $k$ items, only the first $n$ items require more than one attribute
while the $(n+1)$-th through the $k$-th items require only one attribute each, all of which are
distinct. Without loss of generality, we assume $Q'_{ii} = 1$ for $i = n + 1, \dots, k$ and $Q'_{ij} = 0$ for
$i = n + 1, \dots, k$ and $i \neq j$.

(a) $n = 1$. Since $a_1 > 1$, there exists $i > 1$ such that $Q'_{1i} = 1$. Then, the row vector in
$T(Q')$ corresponding to $I_1 \wedge I_i$ (say, the $l$-th row in $T(Q')$) and the row vector of $T(Q')$
corresponding to $I_1$ are identical. On the other hand, the first row and the $l$-th row are
different for $T(Q)$ because $T(Q)_{1:(2^k-1)}$ is a full-rank matrix. The above statement can
be written as

$$B_{Q'}(I_1 \wedge I_i) = B_{Q'}(I_1), \qquad B_Q(I_1 \wedge I_i) \neq B_Q(I_1).$$

(b) $n > 1$ and there exist $j > n$ and $i \le n$ such that $Q'_{ij} = 1$. Then by the same argument
as in (2a), we can find two rows that are identical in $T(Q')$ but different in $T(Q)$. In
particular,

$$B_{Q'}(I_j \wedge I_i) = B_{Q'}(I_i), \qquad B_Q(I_j \wedge I_i) \neq B_Q(I_i).$$

(c) $n > 1$ and for each $j > n$ and $i \le n$, $Q'_{ij} = 0$. Let the $i^*$-th row in $T(Q')$ correspond to
$I_1 \wedge \dots \wedge I_n$. Let the $i^*_h$-th row in $T(Q')$ correspond to $I_1 \wedge \dots \wedge I_{h-1} \wedge I_{h+1} \wedge \dots \wedge I_n$ for
$h = 1, \dots, n$.

We claim that there exists an $h$ such that the $i^*$-th row and the $i^*_h$-th row are identical
in $T(Q')$, that is

$$B_{Q'}(I_1 \wedge \dots \wedge I_{h-1} \wedge I_{h+1} \wedge \dots \wedge I_n) = B_{Q'}(I_1 \wedge \dots \wedge I_n). \qquad (26)$$

We prove this claim by contradiction. Suppose that there does not exist such an $h$. This
is equivalent to saying that for each $j \le n$ there exists an $a_j$ such that $Q'_{ja_j} = 1$ and
$Q'_{ia_j} = 0$ for all $1 \le i \le n$ and $i \neq j$. Equivalently, for each $j \le n$, item $j$ requires at
least one attribute that is not required by the other first $n$ items. Consider

$$C_i = \{j : \text{there exists } i \le i' \le n \text{ such that } Q'_{i'j} = 1\}.$$

Let $\#(\cdot)$ denote the cardinality of a set. Since for each $i \le n$ and $j > n$, $Q'_{ij} = 0$, we
have that $\#(C_1) \le n$. Note that $Q'_{1a_1} = 1$ and $Q'_{ia_1} = 0$ for all $2 \le i \le n$. Therefore,
$a_1 \in C_1$ and $a_1 \notin C_2$. Therefore, $\#(C_2) \le n - 1$. By a similar argument and induction,
we have that $a_n = \#(C_n) \le 1$. This contradicts the fact that $a_n > 1$. Therefore, there
exists an $h$ such that (26) is true. As for $T(Q)$, we have that

$$B_Q(I_1 \wedge \dots \wedge I_{h-1} \wedge I_{h+1} \wedge \dots \wedge I_n) \neq B_Q(I_1 \wedge \dots \wedge I_n).$$

Step 2 

For situations 1, 2a, and 2b, the identification of the column vector is completely identical to
that of Proposition 2. For those three situations, we essentially identified one row corresponding
to a single item and another row corresponding to a combination of single-attribute items. We need
to provide additional proof for situation 2c, that is, the follow-up analysis once (26) is established.
Without loss of generality, we assume that

$$B_{Q'}(I_1 \wedge \dots \wedge I_{n-1}) = B_{Q'}(I_1 \wedge \dots \wedge I_n), \qquad B_Q(I_1 \wedge \dots \wedge I_{n-1}) \neq B_Q(I_1 \wedge \dots \wedge I_n). \qquad (27)$$

Let $h$ be the row corresponding to $I_1 \wedge \dots \wedge I_{n-1}$ and $l$ be the row corresponding to $I_1 \wedge \dots \wedge I_n$.

a Both the $h$-th and the $l$-th row of $T_{c'}(Q')$ are nonzero. Among the first $2^n$ elements of
$B_{c,g,Q}(I_1 \wedge \dots \wedge I_{n-1})$ there exists a nonzero element, say corresponding to attribute $A$. Let
$A'$ be the attribute profile that is identical to $A$ in the first $n - 1$ attributes and differs from it in the
$n$-th element. Then, the $A$-th and $A'$-th elements of $B_{c,g,Q}(I_1 \wedge \dots \wedge I_{n-1})$ are identical (and
nonzero). The $A$-th and $A'$-th elements of $B_{c,g,Q}(I_1 \wedge \dots \wedge I_n)$ must be different. This is
because the $A$-th and $A'$-th elements of $B_{c,g,Q}(I_1 \wedge \dots \wedge I_n)$ are the products of the corresponding
elements in $B_{c,g,Q}(I_1 \wedge \dots \wedge I_{n-1})$ with $c_n$ and $g_n$ respectively and $c_n \neq g_n$. Then, either the
$A$-th or the $A'$-th column of $T_{c,g}(Q)$ is not in the column space of $T_{c'}(Q')$.

b Either the $h$-th or the $l$-th row of $T_{c'}(Q')$ is a zero vector. The identification of the column
vector is straightforward.